How to gain speedups of 1000 on single processors with fast FEM solvers: Benchmarking numerical and computational efficiency

Authors

  • Michael Köster
  • Dominik Göddeke
  • Hilmar Wobker
  • Stefan Turek
Abstract

In Computational Science and in particular in the numerical simulation of PDE problems, optimal serial performance is essential for a successful scale-out to the tera- and petascale dimensions. In this paper, we propose a simple yet fundamental benchmark setting for a PDE problem that we believe any reasonably flexible Finite Element based software should be able to handle effortlessly. The Poisson problem used in these tests allows reliable performance estimates for more challenging simulations. Our performance evaluation focuses on numerical methodology and data layouts rather than implementational fine-tuning. To enable a fair and realistic comparison independent of the underlying numerical methodology, we define the metric total efficiency. Results are presented for two different solver classes, multigrid and Krylov-subspace methods, obtained in single-core computations with our solver packages Feat2 and Feast. We quantitatively emphasise the effect of different storage techniques and numbering (reordering) schemes, which constitute the crucial factor in view of the memory wall problem that ultimately determines performance of all Finite Element codes. We demonstrate a speed-up of more than a factor of 1000 by migrating from a naive implementation of a standard Krylov solver to a sophisticated implementation of an advanced multigrid solver, without applying any adaptivity.
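
The "naive implementation of a standard Krylov solver" mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from Feat2 or Feast; it assumes a 1D Poisson model problem (the unscaled stencil tridiag(-1, 2, -1) with a constant right-hand side), an unpreconditioned conjugate gradient iteration, and the hypothetical helper names `apply_poisson_1d`, `dot` and `cg_iterations`. It only shows that the iteration count of such a solver grows with the number of unknowns, which is the behaviour a multigrid solver is designed to avoid.

```python
import math


def apply_poisson_1d(u):
    """Matrix-free product with tridiag(-1, 2, -1) (illustrative model problem)."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0        # homogeneous Dirichlet boundary
        right = u[i + 1] if i < n - 1 else 0.0   # homogeneous Dirichlet boundary
        out[i] = 2.0 * u[i] - left - right
    return out


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def cg_iterations(n, tol=1e-8):
    """Unpreconditioned CG for A x = b with b = (1, ..., 1).

    Returns the number of iterations needed to reduce the residual norm
    by the factor `tol`, together with the approximate solution.
    """
    b = [1.0] * n
    x = [0.0] * n
    r = b[:]                 # residual for the zero initial guess
    p = r[:]                 # first search direction
    rr = dot(r, r)
    norm0 = math.sqrt(rr)
    for k in range(1, 10 * n + 1):
        ap = apply_poisson_1d(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) <= tol * norm0:
            return k, x
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return 10 * n, x


if __name__ == "__main__":
    for n in (16, 32, 64, 128):
        iterations, _ = cg_iterations(n)
        print(f"unknowns: {n:4d}  CG iterations: {iterations}")
```

Running the script prints the iteration count for a few problem sizes; the count increases as the grid is refined, so the cost per unknown is not constant for this solver class.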

Similar resources

High Performance FEM Simulation in CFD and CSM

Processor technology is still dramatically advancing and promises further enormous improvements in processing data for the next decade. In contrast, much lower advances in moving data are expected such that the efficiency of many numerical software tools for Partial Differential Equations (PDEs) is restricted by the cost for memory access. In last year’s Research Report [7] we outlined the nume...

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

Viscous Models Comparison in Water Impact of Twin 2D Falling Wedges Simulation by Different Numerical Solvers

In this paper, symmetric water entry of twin wedges is investigated for a deadrise angle of 30 degrees. Three numerical simulations of a symmetric impact, considering rigid-body dynamic equations of motion in two-phase flow, are presented. The two-phase flow around the wedges is solved by Finite Element based on Finite Volume method (FEM-FVM) which is used in conjunction with Volume of Fluid (VOF) sc...

Fast Finite Element Method Using Multi-Step Mesh Process

This paper introduces a new method for accelerating slow FEM computations and reducing memory demand in FEM problems with high node resolution or bulky structures. Like most numerical methods, FEM results in a matrix equation which normally has a huge dimension. By breaking the main matrix equation into several smaller matrices, the solving procedure can be accelerated. For implementing ...

Architecting the finite element method pipeline for the GPU

The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core stream...

Publication date: 2008